Spock - a Spoken Corpus Client

نویسندگان

  • Maarten Janssen
  • Tiago Freitas
چکیده

Spock is an open source tool for the easy deployment of time-aligned corpora. It is fully web-based, and has very limited server-side requirements. It allows the end-user to search the corpus in a text-driven manner, obtaining both the transcription and the corresponding sound fragment in the result page. Spock has an administration environment to help manage the sound files and their respective transcription files, and also provides statistical data about the files at hand. Spock uses a proprietary file format for storing the alignment data but the integrated admin environment allows you to import files from a number of common file formats. Spock is not intended as a transcriber program: it is not meant as an alternative to programs such as ELAN, Wavesurfer, or Transcriber, but rather to make corpora created with these tools easily available on line. For the end user, Spock provides a very easy way of accessing spoken corpora, without the need of installing any special software, which might make time-aligned corpora corpora accessible to a large group of users who might otherwise never look at them.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Annotations for Dynamic Diagnosis of the Dialog State

This paper describes recent work aimed at relating multi-level dialog annotations with meta-data annotations for a corpus of real humanhuman dialogs. This work is carried out in the context of the AMITIES project in which spoken dialog systems for call center services are being developed. A corpus of 100 agent-client dialogs have been annotated with three types of annotations. The first are utt...

متن کامل

El-WOZ: a client-server wizard-of-oz interface

In this paper, we present a speech recording interface developed in the context of a project on automatic speech recognition for elderly native speakers of European Portuguese. In order to collect spontaneous speech in a situation of interaction with a machine, this interface was designed as a Wizard-of-Oz (WOZ) plateform. In this setup, users interact with a fake automated dialog system contro...

متن کامل

Vague Language and Interpersonal Communication: An Analysis of Adolescent Intercultural Conversation

This paper is concerned with the analysis of the spoken language of teenagers, taken from a newly developed specialised corpus the British and Taiwanese Teenage Intercultural Communication Corpus (BATTICC). More specifically, the study employs a discourse analytical approach to examine vague language in an intercultural context among a group of British and Taiwanese adolescents, paying particul...

متن کامل

The Effect of CMC in Business Emails in Lingua Franca: Discourse Features and Misunderstandings

The paper argues that everyday exchange of business emails produces a development in the work-group relationship, which, in turn, makes new communication styles possible and acceptable by the users' habit to computer-mediated forms, even in unbalanced professional exchanges. The focus is on the (spoken) discourse features of email messages in a self-compiled corpus of selected computer-mediated...

متن کامل

The Assessment of Pragmatic Knowledge in the Online General IELTS-Practice Resources: A Corpus Analysis of Writing Tasks

Motivated by the concept of Communicative Language Ability and the eminence of the IELTS exam, this study intended to scrutinize the representation of functional knowledge (FK) and socio-linguistic knowledge (SK) as sub-components of pragmatic knowledge in the writing performances of both tasks of the online General IELTS-practice resources across three band scores. This quantitative inter-scor...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008